Aletras, Nikolaos, Timothy Baldwin, Jey Han Lau and Mark Stevenson (to appear) Representing Topics Labels for Exploring Digital Libraries, In Proceedings of Digital Libraries 2014, London, UK

نویسندگان

  • Nikolaos Aletras
  • Timothy Baldwin
  • Jey Han Lau
  • Mark Stevenson
چکیده

Topic models have been shown to be a useful way of representing the content of large document collections, for example via visualisation interfaces (topic browsers). These systems enable users to explore collections by way of latent topics. A standard way to represent a topic is using a set of keywords, i.e. the top-n words with highest marginal probability within the topic. However, alternative topic representations have been proposed, including textual and image labels. In this paper, we compare different topic representations, i.e. sets of topic words, textual phrases and images, in a document retrieval task. We asked participants to retrieve relevant documents based on pre-defined queries within a fixed time limit, presenting topics in one of the following modalities: (1) sets of keywords, (2) textual labels, and (3) image labels. Our results show that textual labels are easier for users to interpret than keywords and image labels. Moreover, the precision of retrieved documents for textual and image labels is comparable to the precision achieved by representing topics using sets of keywords, demonstrating that labelling methods are an effective alternative topic representation.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multimodal Topic Labelling

Topics generated by topic models are typically presented as a list of topic terms. Automatic topic labelling is the task of generating a succinct label that summarises the theme or subject of a topic, with the intention of reducing the cognitive load of end-users when interpreting these topics. Traditionally, topic label systems focus on a single label modality, e.g. textual labels. In this wor...

متن کامل

Labelling Topics using Unsupervised Graph-based Methods

This paper introduces an unsupervised graph-based method that selects textual labels for automatically generated topics. Our approach uses the topic keywords to query a search engine and generate a graph from the words contained in the results. PageRank is then used to weigh the words in the graph and score the candidate labels. The state-of-the-art method for this task is supervised (Lau et al...

متن کامل

Automatic Labelling of Topics with Neural Embeddings

Topics generated by topic models are typically represented as list of terms. To reduce the cognitive overhead of interpreting these topics for end-users, we propose labelling a topic with a succinct phrase that summarises its theme or idea. Using Wikipedia document titles as label candidates, we compute neural embeddings for documents and words to select the most relevant labels for topics. Com...

متن کامل

Proposed content framework for digital literacy education to users in Iran

Aim: today, digital literacy, as a set of skills that enable people to use digital space effectively for success in personal, educational and professional life, has become a necessity in all societies and public libraries are one of the most important providers of digital literacy education in the world. Digital literacy education has not been considered in public libraries in Iran. The first s...

متن کامل

Newman, David, Youn Noh, Edmund Talley, Sarvnaz Karimi and Timothy Baldwin (2010) Evaluating Topic Models for Digital Libraries, In Proceedings of the 2010 Joint Conference on Digital Libraries (JCDL), Gold Coast, Australia, pp. 215¡1⁄2224

Topic models could have a huge impact on improving the ways users find and discover content in digital libraries and search interfaces, through their ability to automatically learn and apply subject tags to each and every item in a collection, and their ability to dynamically create virtual collections on the fly. However, much remains to be done to tap this potential, and empirically evaluate ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014